Chinese Short Text Classification by ERNIE Based on LTC_Block
Abstract
Short text classification, an important direction in basic natural language processing research, has extensive applications. Its effectiveness depends on the feature extraction and feature representation methods used. This paper proposes an ERNIE model based on LTC_Block to classify Chinese short texts, extract semantics from the corpus, and address the polysemy problem in the text. In this model, LTC_Block, a double-channel structural unit composed of BiLSTM and TextCNN, is used to extract contextual sequence features and overall semantic features, and a residual connection is used to further integrate the features of the texts. Experiments on two different datasets showed that the model performed better than mainstream models, demonstrating its feasibility and effectiveness.
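The double-channel idea in the abstract can be illustrated with a minimal PyTorch sketch. This is not the authors' implementation: the class names `LTCBlock` and `ShortTextClassifier`, the kernel sizes, the additive fusion of the two channels, the layer normalization, and the mean pooling are all assumptions made for illustration, and a random tensor stands in for the token embeddings that ERNIE would produce.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class LTCBlock(nn.Module):
    """Hypothetical dual-channel block: BiLSTM + TextCNN with a residual connection."""

    def __init__(self, hidden_size=768, kernel_sizes=(2, 3, 4)):
        super().__init__()
        # Channel 1 (assumed layout): BiLSTM over the token sequence.
        # Each direction uses hidden_size // 2 units so the concatenated
        # output keeps the input width (hidden_size must be even).
        self.bilstm = nn.LSTM(hidden_size, hidden_size // 2,
                              batch_first=True, bidirectional=True)
        # Channel 2 (assumed layout): TextCNN-style parallel 1D convolutions.
        self.convs = nn.ModuleList(
            nn.Conv1d(hidden_size, hidden_size, k) for k in kernel_sizes
        )
        self.cnn_proj = nn.Linear(hidden_size * len(kernel_sizes), hidden_size)
        self.norm = nn.LayerNorm(hidden_size)

    def forward(self, x):
        # x: (batch, seq_len, hidden_size); seq_len must be >= max kernel size.
        seq_feats, _ = self.bilstm(x)                  # (batch, seq_len, hidden)
        conv_in = x.transpose(1, 2)                    # (batch, hidden, seq_len)
        # Max-over-time pooling per kernel size, as in TextCNN.
        pooled = [F.relu(conv(conv_in)).max(dim=2).values for conv in self.convs]
        global_feats = self.cnn_proj(torch.cat(pooled, dim=1))  # (batch, hidden)
        # Fuse the channels by broadcasting the global vector over all positions.
        fused = seq_feats + global_feats.unsqueeze(1)
        # Residual connection back to the block input.
        return self.norm(x + fused)


class ShortTextClassifier(nn.Module):
    """Hypothetical classification head on top of one LTCBlock."""

    def __init__(self, hidden_size=768, num_classes=10):
        super().__init__()
        self.block = LTCBlock(hidden_size)
        self.fc = nn.Linear(hidden_size, num_classes)

    def forward(self, token_embeddings):
        h = self.block(token_embeddings)
        return self.fc(h.mean(dim=1))                  # mean-pool, then classify


# Random embeddings stand in for ERNIE output: 4 texts, 16 tokens, width 32.
embeddings = torch.randn(4, 16, 32)
logits = ShortTextClassifier(hidden_size=32, num_classes=5)(embeddings)
print(logits.shape)
```

Because the block preserves the input shape, several such blocks could be stacked; whether the original model does so is not stated in the abstract.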
Similar Resources
Chinese Short Text Classification Based on Domain Knowledge
People are generating more and more short texts. There is an urgent demand to classify short texts into different domains. Due to the shortness and sparseness of short texts, conventional methods based on Vector Space Model (VSM) have limitations. To tackle the data scarcity problem, we propose a new model to directly measure the correlation between a short text instance and a domain instead of...
Short Text Classification Based on Improved ITC
Long text classification has achieved great success, but short text classification still needs improvement. In this paper, we first explain why we select the ITC feature selection algorithm rather than the conventional TFIDF and describe the advantages of ITC over TFIDF; we then summarize the flaws of the conventional ITC algorithm and present an improved ITC feature selec...
Chinese Short-Text Classification Based on Topic Model with High-Frequency Feature Expansion
Short text differs from traditional documents in its shortness and sparseness. Feature extension can ease the problem of high sparseness in the vector space model, but it inevitably introduces noise. To resolve this problem, this paper proposes a high-frequency feature expansion method based on a latent Dirichlet allocation (LDA) topic model. High-frequency features are extracted from each cate...
Short Text Classification on Complaint Documents
The Indonesian government has developed a system for citizens to voice their aspirations and complaints, which are then stored in the form of short documents. Unfortunately, the existing system employs human annotators to manually categorize the short documents, which is very expensive and time-consuming. As a result, automatically classifying the short documents into their correct topics will redu...
An Arguing Lexicon for Stance Classification on Short Text Comments in Chinese
With the development of social media and online forums, users have grown accustomed to expressing their agreement and disagreement via short texts. Elements that reveal the user’s stance or subjectivity thus become an important resource in identifying the user’s position on a given topic. In the current study, we observe comments on an online bulletin board in Taiwan for how people express the...
Journal
Journal title: Wireless Communications and Mobile Computing
Year: 2022
ISSN: 1530-8669, 1530-8677
DOI: https://doi.org/10.1155/2022/1411744